

SEQUENCE LISTING 



(1) GENERAL INFORMATION 



(i) APPLICANT: Wallis, Nicola G. 



Li \ 




Shilling, Lisa K. 
Mooney, Jeffrey L. 
Debouck, Christine 
Zhong, YiYi 
Jaworski, Deborah D. 
Wang, Min 
Throup, John P. 



(ii) TITLE OF THE INVENTION: Histidine Kinase 



(iii) NUMBER OF SEQUENCES: 6 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Dechert, Price & Rhoads 

(B) STREET: 4000 Bell Atlantic Tower, 1717 Arch Stre 

(C) CITY: Philadelphia 

(D) STATE: PA 

(E) COUNTRY: USA 

(F) ZIP: 19103-2793 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 



(vii) PRIOR APPLICATION DATA: 



(A) APPLICATION NUMBER: 

(B) FILING DATE: 



(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Falk, Stephen T 

(B) REGISTRATION NUMBER: 36,795 

(C) REFERENCE/DOCKET NUMBER: GM10127 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 215-994-2488 

(B) TELEFAX: 215-994-2222 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2201 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

TAATTTAAAA AGCAACTATT GTATAGAAAA ATACAAAATT TAAAATATAT TACCTTATTA 
60 

GAAAAAGTCG TAATATGAGG TGTACAAATG ACGCAAATTT TAATAGTAGA AGATGAACAA 
120 

AACTTAGCAA GATTTCTTGA ATTGGAACTC ACACATGAAA ATTACAATGT GGACACAGAG 
180 

TATGATGGAC AAGACGGTTT AGATAAAGCG CTTAGCCATT ACTATGATTT AATCATATTA 
240 

GATTTAATGT TGCCGTCAAT TAATGGCTTA GAAATTTGTC GCAAAATTAG ACAACAACAA 
300 

TCTACACCTA TCATTATAAT TACAGCGAAA AGTGATACGT ATGACAAAGT TGCTGGGCTT 
360 
GATTACGGTG CAGACGATTA TATAGTTAAG CCGTTTGATA TTGAAGAACT TTTAGCAAGA 
420 

ATTCGTGCAA TTTTACGTCG TCAGCCACAA AAGGATATTA TCGATGTCAA CGGTATTACA 
480 

ATTGATAAGA ACGCTTTTAA AGTGACGGTA AATGGCGCAG AAATTGAATT AACAAAAACA 
540 

GAGTATGATT TACTATATCT TCTAGCTGAA AATAAAAACC ATGTTATGCA ACGGGAACAA 
600 

ATTTTAAATC ATGTATGGGG TTATAATAGT GAAGTAGAAA CAAATGTCGT AGATGTTTAT 
660 

ATAAGATATT TACGAAACAA GTTAAAACCA TACGATCGTG ACAAAATGAT TGAAACAGTT 
720 

CGTGGCGTTG GGTATGTGAT ACGATGACAA AACGTAAATT GCGCAATAAC TGGATTATTG 
780 

TTACCACGAT GATTACGTTT GTCACGATAT TTTTGTTTTG TTTAATTATT ATTTTTTTCT 
840 

TGAAAGATAC ACTGCATAAT AGTGAGCTTG ATGATGCAGA ACGAAGCTCA AGCGATATTA 
900 

ATAATTTATT TCATTCTAAG CCTGTTAAAG ATATATCTGC ATTAGACTTG AATGCATCTT 
960 

TAGGTAATTT TCAAGAGATA ATTATTTATG ATGAGCATAA TAATAAATTA TTTGAGACAT 
1020 

CGAATGATAA CACAGTGAGA GTTGAACCAG GTTATGAACA CCGTTATTTT GACCGCGTAA 
1080 

TAAAAAAACG CTATAAAGGC ATTGAATATT TAATTATTAA AGAACCAATT ACAACGCAAG 
1140 

ATTTCAAAGG GTATAGCTTG TTAATTCATT CACTAGAAAA TTATGATAAC ATCGTAAAAT 
1200 

CATTGTATAT CATTGCGCTG GCATTTGGAG TGATTGCAAC AATTATAACT GCCACAATCA 
1260 

GTTATGTATT TTCAACACAA ATTACTAAAC CGCTTGTCAG TTTATCAAAT AAAATGATTG 
1320 

AGATTCGACG AGATGGTTTT CAAAATAAAT TGCAATTAAA TACAAATTAT GAAGAAATAG 
1380 

ATAATTTAGC AAATACGTTT AATGAGATGA TGAGCCAAAT TGAAGAATCA TTTAATCAAC 
1440 

AAAGACAATT TGTTGAAGAT GCGTCACATG AATTACGAAC ACCATTACAA ATTATTCAAG 
1500 

GTCATTTAAA TTTGATTCAG CGATGGGGAA AAAAAGACCC AGCAGTATTA GAAGAATCGT 
1560 



TAAATATTTC TATTGAAGAA ATGAATCGTA TCATAAAATT AGTCGAAGAA TTACTTGAAT 
1620 

TGACTAAAGG AGATGTAAAT GACATTTCTT CTGAAGCGCA GACCGTGCAT ATTAATGATG 
1680 

AAATTCGCTC GCGAATACAC TCATTAAAAC AATTGCATCC TGATTATCAA TTTGATACGG 
1740 

ATCTGACATC TAAAAATCTA GAAATTAAAA TGAAACCTCA TCAATTCGAA CAATTATTTT 
1800 

TAATCTTTAT TGATAATGCA ATCAAATATG ATGTGAAGAA TAAGAAAATT AAAGTTAAGA 
1860 

CAAGGTTAAA AAATAAGCAA AAAATAATTG AAATTACAGA TCATGGAATT GGTATTCCAG 
1920 

AGGAAGATCA AGATTTCATT TTTGATCGCT TTTATCGAGT GGATAAATCT CGTTCAAGAA 
1980 

GTCAAGGCGG TAATGGACTC GGATTATCTA TTGCTCAAAA AATCATTCAA TTAAACGGAG 
2040 

GATCGATTAA AATTAAAAGT GAAATTAATA AAGGAACAAC GTTTAAAATC ATATTTTAAT 
2100 

CATGTCTGAG ACGTCAATCA AAGTCATAGG ATCAATTTTT TAAGTACACA TTAGCTGTGA 
2160 

CTAATGTATA AGAACAACTA TAAAACAAAT AAACAGTGGT T 
2201 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 451 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Thr Lys Arg Lys Leu Arg Asn Asn Trp Ile Ile Val Thr Thr Met 

1 5 10 15 

Ile Thr Phe Val Thr Ile Phe Leu Phe Cys Leu Ile Ile Ile Phe Phe 

20 25 30 

Leu Lys Asp Thr Leu His Asn Ser Glu Leu Asp Asp Ala Glu Arg Ser 

35 40 45 

Ser Ser Asp Ile Asn Asn Leu Phe His Ser Lys Pro Val Lys Asp Ile 

50 55 60 

Ser Ala Leu Asp Leu Asn Ala Ser Leu Gly Asn Phe Gln Glu Ile Ile 

65 70 75 80 

Ile Tyr Asp Glu His Asn Asn Lys Leu Phe Glu Thr Ser Asn Asp Asn 

85 90 95 

Thr Val Arg Val Glu Pro Gly Tyr Glu His Arg Tyr Phe Asp Arg Val 

100 105 110 

Ile Lys Lys Arg Tyr Lys Gly Ile Glu Tyr Leu Ile Ile Lys Glu Pro 

115 120 125 

Ile Thr Thr Gln Asp Phe Lys Gly Tyr Ser Leu Leu Ile His Ser Leu 

130 135 140 

Glu Asn Tyr Asp Asn Ile Val Lys Ser Leu Tyr Ile Ile Ala Leu Ala 

145 150 155 160 

Phe Gly Val Ile Ala Thr Ile Ile Thr Ala Thr Ile Ser Tyr Val Phe 

165 170 175 

Ser Thr Gln Ile Thr Lys Pro Leu Val Ser Leu Ser Asn Lys Met Ile 

180 185 190 

Glu Ile Arg Arg Asp Gly Phe Gln Asn Lys Leu Gln Leu Asn Thr Asn 

195 200 205 

Tyr Glu Glu Ile Asp Asn Leu Ala Asn Thr Phe Asn Glu Met Met Ser 

210 215 220 

Gln Ile Glu Glu Ser Phe Asn Gln Gln Arg Gln Phe Val Glu Asp Ala 

225 230 235 240 

Ser His Glu Leu Arg Thr Pro Leu Gln Ile Ile Gln Gly His Leu Asn 

245 250 255 

Leu Ile Gln Arg Trp Gly Lys Lys Asp Pro Ala Val Leu Glu Glu Ser 

260 265 270 

Leu Asn Ile Ser Ile Glu Glu Met Asn Arg Ile Ile Lys Leu Val Glu 

275 280 285 

Glu Leu Leu Glu Leu Thr Lys Gly Asp Val Asn Asp Ile Ser Ser Glu 

290 295 300 

Ala Gln Thr Val His Ile Asn Asp Glu Ile Arg Ser Arg Ile His Ser 

305 310 315 320 

Leu Lys Gln Leu His Pro Asp Tyr Gln Phe Asp Thr Asp Leu Thr Ser 

325 330 335 

Lys Asn Leu Glu Ile Lys Met Lys Pro His Gln Phe Glu Gln Leu Phe 

340 345 350 

Leu Ile Phe Ile Asp Asn Ala Ile Lys Tyr Asp Val Lys Asn Lys Lys 

355 360 365 

Ile Lys Val Lys Thr Arg Leu Lys Asn Lys Gln Lys Ile Ile Glu Ile 

370 375 380 

Thr Asp His Gly Ile Gly Ile Pro Glu Glu Asp Gln Asp Phe Ile Phe 

385 390 395 400 

Asp Arg Phe Tyr Arg Val Asp Lys Ser Arg Ser Arg Ser Gln Gly Gly 

405 410 415 

Asn Gly Leu Gly Leu Ser Ile Ala Gln Lys Ile Ile Gln Leu Asn Gly 

420 425 430 

Gly Ser Ile Lys Ile Lys Ser Glu Ile Asn Lys Gly Thr Thr Phe Lys 

435 440 445 

Ile Ile Phe 
450 

(2) INFORMATION FOR SEQ ID NO: 3: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 736 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

ATTTACGTTT TGTCATCGTA TCACATACCC AACGCCACGA ACTGTTTCAA TCATTTTGTC 
60 

ACGATCGTAT GGTTTTAACT TGTTTCGTAA ATATCTTATA TAAACATCTA CGACATTTGT 
120 

TTCTACTTCA CTATTATAAC CCCATACATG ATTTAAAATT TGTTCCCGTT GCATAACATG 
180 

GTTTTTATTT TCAGCTAGAA GATATAGTAA ATCATACTCT GTTTTTGTTA ATTCAATTTC 
240 

TGCGCCATTT ACCGTCACTT TAAAAGCGTT CTTATCAATT GTAATACCGT TGACATCGAT 
300 

AATATCCTTT TGTGGCTGAC GACGTAAAAT TGCACGAATT CTTGCTAAAA GTTCTTCAAT 
360 

ATCAAACGGC TTAACTATAT AATCGTCTGC ACCGTAATCA AGCCCAGCAA CTTTGTCATA 
420 

CGTATCACTT TTCGCTGTAA TTATAATGAT AGGTGTAGAT TGTTGTTGTC TAATTTTGCG 
480 
ACAAATTTCT AAGCCATTAA TTGACGGCAA CATTAAATCT AATATGATTA AATCATAGTA 
540 

ATGGCTAAGC GCTTTATCTA AACCGTCTTG TCCATCATAC TCTGTGTCCA CATTGTAATT 
600 

TTCATGTGTG AGTTCCAATT CAAGAAATCT TGCTAAGTTT TGTTCATCTT CTACTATTAA 
660 

AATTTGCGTC ATTTGTACAC CTCATATTAC GACTTTTTCT AATAAGGTAA TATATTTTAA 
720 

ATTTTGTATT TTTCTA 
736 



(2) INFORMATION FOR SEQ ID NO: 4: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 219 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



Met Thr Gln Ile Leu Ile Val Glu Asp Glu Gln Asn Leu Ala Arg Phe 

1 5 10 15 

Leu Glu Leu Glu Leu Thr His Glu Asn Tyr Asn Val Asp Thr Glu Tyr 

20 25 30 

Asp Gly Gln Asp Gly Leu Asp Lys Ala Leu Ser His Tyr Tyr Asp Leu 

35 40 45 

Ile Ile Leu Asp Leu Met Leu Pro Ser Ile Asn Gly Leu Glu Ile Cys 

50 55 60 

Arg Lys Ile Arg Gln Gln Gln Ser Thr Pro Ile Ile Ile Ile Thr Ala 

65 70 75 80 

Lys Ser Asp Thr Tyr Asp Lys Val Ala Gly Leu Asp Tyr Gly Ala Asp 

85 90 95 

Asp Tyr Ile Val Lys Pro Phe Asp Ile Glu Glu Leu Leu Ala Arg Ile 

100 105 110 

Arg Ala Ile Leu Arg Arg Gln Pro Gln Lys Asp Ile Ile Asp Val Asn 

115 120 125 

Gly Ile Thr Ile Asp Lys Asn Ala Phe Lys Val Thr Val Asn Gly Ala 

130 135 140 

Glu Ile Glu Leu Thr Lys Thr Glu Tyr Asp Leu Leu Tyr Leu Leu Ala 
145 150 155 160 

Glu Asn Lys Asn His Val Met Gln Arg Glu Gln Ile Leu Asn His Val 

165 170 175 

Trp Gly Tyr Asn Ser Glu Val Glu Thr Asn Val Val Asp Val Tyr Ile 

180 185 190 

Arg Tyr Leu Arg Asn Lys Leu Lys Pro Tyr Asp Arg Asp Lys Met Ile 

195 200 205 

Glu Thr Val Arg Gly Val Gly Tyr Val Ile Arg 
210 215 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

ATGACAAAAC GTAAATTGCG CAATAAC 
27 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

AAATATGATT TTAAACGTTG TTCC 
24 